Discriminative feature weighting for HMM-based continuous speech recognizers
Authors
Abstract
The Discriminative Feature Extraction (DFE) method provides an appropriate formalism for designing the front-end feature extraction module in pattern classification systems. In recent years, this formalism has been successfully applied to several speech recognition problems, such as vowel classification, phoneme classification, and isolated word recognition. The DFE formalism can also be applied to weight the contribution of the components of the feature vector. This variant of DFE, which we call Discriminative Feature Weighting (DFW), improves pattern classification systems by enhancing the components that are most relevant for discriminating among the classes. This paper is dedicated to the application of the DFW formalism to Continuous Speech Recognizers (CSR) based on Hidden Markov Models (HMMs). Two types of HMM-based speech recognizers are considered: recognizers based on Discrete HMMs (DHMMs), for which the acoustic evaluation is based on a Euclidean distance measure, and recognizers based on Semi-Continuous HMMs (SCHMMs), for which the acoustic evaluation is performed using a mixture of multivariate Gaussians. We report how the components can be weighted and how the weights can be discriminatively trained and applied to the speech recognizers. We present recognition results for several continuous speech recognition tasks. The experimental results show the utility of DFW for HMM-based continuous speech recognizers. © 2001 Elsevier Science B.V. All rights reserved.
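To make the idea of weighting feature components concrete, the following is a minimal sketch of a weighted squared Euclidean distance used for nearest-codeword selection, as might occur in the vector quantization stage of a DHMM recognizer. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy codebook, and the hand-picked weights are all hypothetical, and the discriminative training of the weights described in the abstract is not shown.

```python
def weighted_sq_distance(x, codeword, weights):
    """Squared Euclidean distance with one non-negative weight per component.

    With all weights equal to 1 this reduces to the ordinary squared
    Euclidean distance. (Hypothetical helper, not taken from the paper.)
    """
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, codeword))


def quantize(x, codebook, weights):
    """Return the index of the codeword nearest to x under the weighted distance."""
    return min(
        range(len(codebook)),
        key=lambda k: weighted_sq_distance(x, codebook[k], weights),
    )


# Toy data (assumed for illustration only).
x = [1.0, 1.0]
codebook = [[0.0, 1.0], [1.0, 3.0]]

uniform = [1.0, 1.0]     # ordinary Euclidean distance
emphasis = [10.0, 0.1]   # hand-picked weights stressing the first component

print(quantize(x, codebook, uniform))   # nearest codeword without weighting
print(quantize(x, codebook, emphasis))  # nearest codeword with weighting
```

With uniform weights the first codeword is nearest; emphasizing the first component makes the second codeword nearest instead, which shows how the weights can change the classification decision. In DFW the weights would be learned discriminatively rather than set by hand.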
Similar resources
IMPROVED HMM ENTROPY FOR ROBUST SUB-BAND SPEECH RECOGNITION (ThuPmOR1)
In recent years, sub-band speech recognition has been found useful in robust speech recognition, especially for speech signals contaminated by band-limited noise. In sub-band speech recognition, full band speech is divided into several frequency sub-bands and then sub-band feature vectors or their generated likelihoods by corresponding sub-band recognizers are combined to give the result of rec...
Robust speech recognition using discriminative stream weighting and parameter interpolation
This paper presents a method to improve the robustness of speech recognition in noisy conditions. It has been shown that using dynamic features in addition to static features can improve the noise robustness of speech recognizers. In this work we show that in a continuous-density Hidden Markov Model (HMM) based speech recognition system, weighting the contribution of the dynamic features accord...
Speech recognition with a new hybrid architecture combining neural networks and continuous HMM
Abstract. In this paper, we focus on a novel NN/HMM architecture for continuous speech recognition. The architecture incorporates a neural feature extraction to gain more discriminative feature vectors for the underlying HMM system. The feature extraction can be chosen either linear or non-linear and can incorporate recurrent connections. With this hybrid system, that is an extension of a state...
A NN/HMM hybrid for continuous speech recognition with a discriminant nonlinear feature extraction
This paper deals with a hybrid NN/HMM architecture for continuous speech recognition. We present a novel approach to set up a neural linear or nonlinear feature transformation that is used as a preprocessor on top of the HMM system’s RBF-network to produce discriminative feature vectors that are well suited for being modeled by mixtures of Gaussian distributions. In order to omit the computatio...
Dimensionality reduction of the enhanced feature set for the HMM-based speech recognizer
In the past few years, a great deal of research has been directed toward finding acoustic features that are effective for automatic speech recognition. Until recently, most of the speech recognizers used about 12 cepstral coefficients derived through the linear prediction analysis as recognition features [1]. In [2,3], Furui investigated the use of temporal derivatives of cepstral coefficients...
Journal title:
- Speech Communication
Volume 38, Issue
Pages -
Publication date: 2002